Sample-weighted clustering methods

Authors

  • Jian Yu
  • Miin-Shen Yang
  • E. Stanley Lee
Abstract

Keywords: Cluster analysis; Maximum entropy principle; k-means; Fuzzy c-means; Sample weights; Robustness

Although there has been much research on cluster analysis that considers feature (or variable) weights, little effort has been devoted to sample weights in clustering. In practice, not every sample in a data set is equally important for cluster analysis, so it is worthwhile to obtain proper sample weights for clustering a data set. In this paper, we use a probability distribution over a data set to represent its sample weights, and we apply the maximum entropy principle to compute these weights automatically. This approach can generate sample-weighted versions of most clustering algorithms, such as k-means, fuzzy c-means (FCM), and expectation-maximization (EM). The proposed sample-weighted clustering algorithms are robust for data sets with noise and outliers. We also analyze the convergence properties of the proposed algorithms and use numerical and real data sets for demonstration and comparison. The experimental results and comparisons demonstrate that the proposed sample-weighted clustering algorithms are effective and robust clustering methods.
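To make the idea of maximum-entropy sample weights concrete, the following is a minimal Python sketch of a sample-weighted k-means. It is not the authors' exact algorithm, whose formulas are not given in the abstract. It assumes the objective sum_i w_i d_i + gamma * sum_i w_i log(w_i) with the weights summing to 1, where d_i is the squared distance from sample i to its nearest centre. Under that assumption the optimal weights are proportional to exp(-d_i / gamma). The function name `sample_weighted_kmeans`, the temperature `gamma`, and the `init` argument are all hypothetical names introduced for illustration.

```python
import numpy as np


def sample_weighted_kmeans(X, k, gamma=1.0, n_iter=50, init=None, seed=0):
    """Illustrative sample-weighted k-means (an assumed formulation).

    Weights form a probability distribution over the samples and are set
    to w_i proportional to exp(-d_i / gamma), the maximum-entropy solution
    of the assumed objective described in the lead-in text.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if init is None:
        # Pick k distinct samples as starting centres.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)
    weights = np.full(n, 1.0 / n)  # start from the uniform distribution
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centre.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        d_min = d2[np.arange(n), labels]

        # Maximum-entropy weights; subtracting the minimum distance keeps
        # the exponentials numerically stable without changing the result.
        weights = np.exp(-(d_min - d_min.min()) / gamma)
        weights /= weights.sum()

        # Each centre becomes the weighted mean of its assigned samples.
        for j in range(k):
            mask = labels == j
            if mask.any() and weights[mask].sum() > 0:
                centers[j] = np.average(X[mask], axis=0, weights=weights[mask])

    return centers, labels, weights
```

In this sketch a sample far from every centre receives a weight close to zero, so it barely moves the centre it is assigned to. This is one way to read the robustness claim in the abstract. A larger `gamma` pushes the weights toward uniform, which recovers ordinary k-means in the limit.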

Similar resources

Bilateral Weighted Fuzzy C-Means Clustering

Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...

Sample-Weighted Fuzzy Clustering with Regularizations

Although there has been much research in cluster analysis on feature weights, little effort has been made on sample weights. Recently, Yu et al. (2011) considered a probability distribution over a data set to represent its sample weights and then proposed sample-weighted clustering algorithms. In this paper, we give a sample-weighted version of generalized fuzzy clustering regularizati...

A Kernel Fuzzy Clustering Algorithm with Generalized Entropy Based on Weighted Sample

This paper presents a kernel fuzzy clustering algorithm with generalized entropy based on weighted samples. By introducing sample weights into the objective function for fuzzy clustering with generalized entropy, we obtain an optimization problem for fuzzy clustering with generalized entropy based on weighted samples. We then use the Lagrange multiplier method ...

Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering

Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...

A Weighted Sample’s Fuzzy Clustering Algorithm With Generalized Entropy

Fuzzy clustering with generalized entropy is studied in combination with sample weights and a kernel function. An objective function for fuzzy clustering with generalized entropy based on sample weighting is obtained, and a fuzzy clustering algorithm with generalized entropy based on sample weighting is then presented. In addition, by introducing kernel into the presented objective functio...

Journal:
  • Computers & Mathematics with Applications

Volume 62  Issue

Pages -

Publication date: 2011